-
-
Notifications
You must be signed in to change notification settings - Fork 18.8k
BUG/TST: added TypeError if object dtypes are detected in dataframe #61682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
BUG/TST: added TypeError if object dtypes are detected in dataframe #61682
Conversation
The second concern of the issue #55114 was If |
i would expect this to attempt to operate pointwise (which would still raise on e.g. strings) |
Do you mean I should rewrite the code so that it attempts to round every column individually and then raise if there is a non-numeric column? As opposed to looking at the |
Ohh because |
im specifically thinking of object dtype columns containing numeric entries |
Ah okay, makes sense. I think the current behavior for The
Should I submit a PR to change this behavior first before implementing your pointwise solution? |
@jbrockmendel Hi, no worries if too busy to look into this right now just curious if you had any insight on the above comment for making a different PR first, thanks! |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
@@ -11227,7 +11227,10 @@ def _series_round(ser: Series, decimals: int) -> Series: | |||
return ser | |||
|
|||
nv.validate_round(args, kwargs) | |||
|
|||
if "object" in self.dtypes.values: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing to do is operate pointwise on object columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, you are saying if there is an object column present in the data frame, we should first try to round just that column and see if an error is raised? Because an object columns intended behavior here would be to raise on non-numeric entries and not raise numeric entries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I guess I'm a little confused what operate pointwise means in this context because I think the current behavior of Series.round
is to raise if the dtype is object regardless if the entries are numeric. And because the goal of the PR is to make Series.round
and frame DataFrame.round
consistent then shouldn't we also raise if any of the columns are of dtype object? Sorry if I'm super complicating this, just trying to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And because the goal of the PR is to make Series.round and frame DataFrame.round consistent then shouldn't we also raise if any of the columns are of dtype object?
I'm saying the series version should behave like series.map(round)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay, so it should behave like series.map(round)
for object columns but every other dtype should continue to do a non-pointwise operation.
So like for
ser = pd.Series([1.22, 3.33, np.nan], dtype="float64")
round(ser)
should continue to not raise an error even though round(ser[2])
does raise?
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.This PR addresses concern 1 of #55114 - Having consistent behavior with
Series.round
&DataFrame.round
.My solution was to raise a
TypeError
in a way similar way to #61206.I changed the following existing tests, but a little worried that might break some things, so any feedback is appreciated.
tests/frrame/methods/test_round.py
'stest_round_mixed_type
as I felt that test conflicted with the current intended behavior ofDataFrame.round
tests/copy_view/test_methods.py
'stest_round
as it was using a dataframe with strings and ints in its test